15 research outputs found

    How Hard Is It to Satisfy (Almost) All Roommates?

    The classic Stable Roommates problem (the non-bipartite generalization of the well-known Stable Marriage problem) asks whether there is a stable matching for a given set of agents, i.e., a partitioning of the agents into disjoint pairs such that no two agents induce a blocking pair. Herein, each agent has a preference list denoting who it prefers to have as a partner, and two agents are blocking if they prefer to be with each other rather than with their assigned partners. Since stable matchings may not be unique, we study an NP-hard optimization variant of Stable Roommates, called Egal Stable Roommates, which seeks to find a stable matching with a minimum egalitarian cost gamma, i.e., one that minimizes the sum of the dissatisfaction of the agents. The dissatisfaction of an agent is the number of agents that this agent prefers over its partner if it is matched; otherwise it is the length of its preference list. We also study almost stable matchings in a variant called Min-Block-Pair Stable Roommates, which seeks to find a matching with a minimum number beta of blocking pairs. Our main result is that Egal Stable Roommates parameterized by gamma is fixed-parameter tractable, while Min-Block-Pair Stable Roommates parameterized by beta is W[1]-hard, even if the length of each preference list is at most five.
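    The two quantities defined in this abstract, egalitarian cost and the number of blocking pairs, can be illustrated with a minimal sketch. It assumes strict preference lists without ties and that matched partners appear on each other's lists. The function names and the four-agent instance are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def dissatisfaction(agent, partner, prefs):
    """Agents that `agent` prefers over `partner`; list length if unmatched."""
    if partner is None:
        return len(prefs[agent])
    return prefs[agent].index(partner)

def egalitarian_cost(matching, prefs):
    """Sum of the dissatisfaction of all agents (gamma in the abstract)."""
    return sum(dissatisfaction(a, matching.get(a), prefs) for a in prefs)

def prefers(agent, candidate, prefs, matching):
    """True if `agent` strictly prefers `candidate` to its current situation."""
    if candidate not in prefs[agent]:
        return False
    current = matching.get(agent)
    if current is None:
        return True  # an unmatched agent prefers any acceptable candidate
    return prefs[agent].index(candidate) < prefs[agent].index(current)

def blocking_pairs(matching, prefs):
    """Unmatched pairs in which both agents prefer each other to their partners."""
    return [
        (a, b)
        for a, b in combinations(sorted(prefs), 2)
        if matching.get(a) != b
        and prefers(a, b, prefs, matching)
        and prefers(b, a, prefs, matching)
    ]

# Hypothetical instance: each list is ordered from most to least preferred.
prefs = {
    "a": ["b", "c", "d"],
    "b": ["c", "a", "d"],
    "c": ["a", "b", "d"],
    "d": ["a", "b", "c"],
}
# Matching {a-b, c-d}, stored symmetrically.
matching = {"a": "b", "b": "a", "c": "d", "d": "c"}

cost = egalitarian_cost(matching, prefs)
blocking = blocking_pairs(matching, prefs)
```

    For this instance the cost is 0 + 1 + 2 + 2 = 5, and the only blocking pair is (b, c), so the matching is not stable but has beta = 1.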

    Improving Grounded Natural Language Understanding through Human-Robot Dialog

    Natural language understanding for robotics can require substantial domain- and platform-specific engineering. For example, for mobile robots to pick-and-place objects in an environment to satisfy human commands, we can specify the language humans use to issue such commands, and connect concept words like "red can" to physical object properties. One way to alleviate this engineering for a new domain is to enable robots in human environments to adapt dynamically, continually learning new language constructions and perceptual concepts. In this work, we present an end-to-end pipeline for translating natural language commands to discrete robot actions, and use clarification dialogs to jointly improve language parsing and concept grounding. We train and evaluate this agent in a virtual setting on Amazon Mechanical Turk, and we transfer the learned agent to a physical robot platform to demonstrate it in the real world.

    A Domain-Agnostic Approach for Characterization of Lifelong Learning Systems

    Despite the advancement of machine learning techniques in recent years, state-of-the-art systems lack robustness to "real world" events, where the input distributions and tasks encountered by the deployed systems will not be limited to the original training context, and systems will instead need to adapt to novel distributions and tasks while deployed. This critical gap may be addressed through the development of "Lifelong Learning" systems that are capable of 1) Continuous Learning, 2) Transfer and Adaptation, and 3) Scalability. Unfortunately, efforts to improve these capabilities are typically treated as distinct areas of research that are assessed independently, without regard to the impact of each separate capability on other aspects of the system. We instead propose a holistic approach, using a suite of metrics and an evaluation framework to assess Lifelong Learning in a principled way that is agnostic to specific domains or system techniques. Through five case studies, we show that this suite of metrics can inform the development of varied and complex Lifelong Learning systems. We highlight how the proposed suite of metrics quantifies performance trade-offs present during Lifelong Learning system development, both the widely discussed Stability-Plasticity dilemma and the newly proposed relationship between Sample Efficient and Robust Learning. Further, we make recommendations for the formulation and use of metrics to guide the continuing development of Lifelong Learning systems and assess their progress in the future. Comment: To appear in Neural Network

    Distributed Constraint Optimization for Mobile Sensor Teams (Doctoral Consortium)

    Coordinating a team of mobile sensing agents (MST) to adequately position themselves with regard to points of interest, generally called targets (e.g., disaster survivors, military targets, or pollution spills), is a challenging problem in many multiagent applications. Such applications are inherently dynamic due to changes in the environment, technology failures, and incomplete knowledge of the agents. Agents must adaptively respond by changing their locations to continually optimize the coverage of targets. Optimally choosing where to position agents to meet the coverage requirements in a static setting is a known NP-hard optimization problem. Doing so in a dynamic distributed environment is a challenging task. In this work I continue to develop and study the DCOP MST model. DCOP is a general model of distributed multi-agent coordination. A DCOP consists of agents, variables, and (soft and hard) constraints between sets of variables that reflect the costs of assignments to the variables. Each agent has exclusive control over a subset of the variables and knows information relevant to its variables, such as the values that can be assigned to them (their domains) and the constraints involving them. The goal is to select an assignment of values to the variables that minimizes the aggregated costs of the constraints. In many ways DCOPs are a natural fit for MST applications, which are inherently decentralized. However, DCOPs fall short in two ways. First, constraints in an MST problem may involve all agents, which can result in an exponential-sized constraint structure that is difficult to solve. Second, DCOP is a static model. In contrast, the coverage problem confronting the agents in realistic applications is highly dynamic.
There are three types of dynamism in MST applications: changes in the environment external to the agents, including targets arising, moving, and disappearing, or target coverage requirements being modified by an outside authority; changes inherent to the agents, including sensor failures resulting in targets being missed or false information being disseminated; and changes in the agents' knowledge of the environment, such as the presence of targets and the quality with which they can be sensed from different locations. In DCOP MST, agents maintain variables for their physical positions, while each target is represented by a constraint that reflects the quality of coverage of that target. In contrast to conventional, static DCOP, DCOP MST not only permits dynamism but exploits it by restricting variable domains to nearby locations; consequently, variable domains and constraints change as the agents move through the environment. DCOP MST confers three major advantages. It directly represents the multiple forms of dynamism inherent in MSTs. It also provides a compact representation that can be solved efficiently with local search algorithms, with information and communication locality based on physical locality as typically occurs in MST applications. Finally, DCOP MST facilitates organization of the team into multiple sub-teams that can specialize in different roles and coordinate their activity through dynamic events. We demonstrate how a search-and-detection team responsible for finding new targets and a surveillance sub-team tasked with coverage of known targets can effectively work together to improve performance while using the DCOP MST framework to coordinate. We propose different algorithms to meet the specific needs of each sub-team and several methods for cooperation between sub-teams. For the search-and-detection team, we develop an algorithm based on DSA that forces intensive exploration for new targets.
For the surveillance sub-team, we adapt several well-known incomplete DCOP algorithms, including the Maximum Gain Messages (MGM) algorithm, the Distributed Stochastic Algorithm (DSA), and the Max-sum algorithm, which requires us to develop an efficient method for agents to find the value assignment in their local environment that is optimal in minimizing the maximum unmet coverage requirement over all targets. In order to avoid an exponential constraint network, instead of choosing from among all possible locations, each agent considers only nearby locations. Constraints thus do not need to involve all agents at all times but only the agents who are close enough to possibly cover the target. The disadvantage of dynamic domains based on physical locality is that adaptations of standard local search algorithms tend to become trapped in local optima where targets beyond the immediate range of the agents go uncovered. To address this shortcoming we develop exploration methods to be used with the local search algorithms. In designing the algorithms that the agents run, we must balance
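    The locality idea described in this abstract (domains restricted to nearby locations, one coverage constraint per target, and an objective of minimizing the maximum unmet coverage requirement) can be sketched as follows. This is an illustrative toy under stated assumptions: a square grid, Euclidean ranges, and a fixed per-agent coverage contribution called `credibility`. All names and values are hypothetical. The move rule is a plain greedy best response for a single agent, not MGM, DSA, or Max-sum.

```python
from math import dist

def nearby_locations(pos, mobility_range, grid_size):
    """Grid cells within mobility_range of pos: the agent's current domain."""
    return [
        (x, y)
        for x in range(grid_size)
        for y in range(grid_size)
        if dist((x, y), pos) <= mobility_range
    ]

def unmet_requirement(target_pos, requirement, positions, sensing_range, credibility):
    """Coverage still missing for one target, given all agent positions."""
    covered = sum(credibility for p in positions if dist(p, target_pos) <= sensing_range)
    return max(0, requirement - covered)

def max_unmet(targets, positions, sensing_range, credibility):
    """Objective: the largest unmet coverage requirement over all targets."""
    return max(
        unmet_requirement(t, req, positions, sensing_range, credibility)
        for t, req in targets
    )

def best_local_move(i, positions, targets, mobility_range, sensing_range,
                    credibility, grid_size):
    """Greedy best response of agent i over its nearby locations only."""
    def cost(candidate):
        trial = positions[:i] + [candidate] + positions[i + 1:]
        return max_unmet(targets, trial, sensing_range, credibility)
    return min(nearby_locations(positions[i], mobility_range, grid_size), key=cost)

# Hypothetical instance: two agents, two targets on a 5x5 grid.
positions = [(0, 0), (4, 4)]
targets = [((2, 0), 1), ((4, 3), 1)]  # (location, coverage requirement)

before = max_unmet(targets, positions, sensing_range=1, credibility=1)
move = best_local_move(0, positions, targets, mobility_range=1,
                       sensing_range=1, credibility=1, grid_size=5)
after = max_unmet(targets, [move, positions[1]], sensing_range=1, credibility=1)
```

    In this instance agent 0 can only consider three cells, and moving to (1, 0) brings the uncovered target within sensing range, reducing the objective from 1 to 0. A target farther than the agent's combined mobility and sensing range would leave every nearby move equally costly, which is the kind of local optimum the abstract says exploration methods are meant to escape.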

    Explorative Max-sum for Teams of Mobile Sensing Agents

    Multi-agent applications that include teams of mobile sensing agents are challenging since they are inherently dynamic and a single movement of a mobile sensor can change the problem that the whole team is facing. While agents select their positions with respect to the information available to them in their local environment, by moving to a different location they can reveal new information, e.g., targets, which they were not aware of before. Thus, exploration is required for such information to be revealed. A variation of the DCOP model (DCOP_MST) was previously adjusted to represent such problems along with local search algorithms that were enhanced with exploration methods. In this paper we design an explorative version of Max-sum for solving DCOP_MST, which is based on an iterative process where, at each iteration, agents generate and solve a specific problem instance. We demonstrate that this basic algorithm (Max-sum_MST) converges faster than other standard local search algorithms that were adjusted to solve DCOP_MSTs; however, its exploitive nature makes it inferior to explorative local search algorithms. Thus, we designed exploration methods that, when combined with basic Max-sum_MST, significantly outperform the existing explorative local search algorithms. Moreover, the best performing method we propose also eliminates the exponential time complexity of Max-sum by bounding the number of agents involved in each constraint.

    A Penny for Your Thoughts: The Value of Communication in Ad Hoc Teamwork

    You are viewing a past publication from the Good Systems Network Digest in May 2020. Office of the VP for Researc

    Human-Interactive Robot Learning (HIRL)

    With robots poised to enter our daily environments, we conjecture that they will not only need to work for people, but also learn from them. An active area of investigation in the robotics, machine learning, and human-robot interaction communities is the design of teachable robotic agents that can learn interactively from human input. To refer to these research efforts, we use the umbrella term Human-Interactive Robot Learning (HIRL). While algorithmic solutions for robots learning from people have been investigated in a variety of ways, HIRL, as a fairly new research area, is still lacking: 1) a formal set of definitions to classify related but distinct research problems or solutions, 2) benchmark tasks, interactions, and metrics to evaluate the performance of HIRL algorithms and interactions, and 3) clear long-term research challenges to be addressed by different communities. The main goal of this workshop will be to consolidate relevant recent work falling under the HIRL umbrella into a coherent set of long-, medium-, and short-term research problems, and identify the most pressing future research goals in this area. As HIRL is a developing research area, this workshop is an opportunity to break the existing boundaries between relevant research communities by developing and sharing a diverse set of benchmark tasks and metrics for HIRL, inspired by other fields including neuroscience, biology, and ethics research.